Impediments to DPA
There are a number of impediments to the effective adoption and use of digital repositories. The main ones are cost and time impediments and technology-related impediments. These will affect the scope of the data that are deposited for a given project or endeavor. Investigators are sure to weigh the costs in time and money of depositing a given set of data against the benefits to the investigator and to the field more broadly. We believe that these tradeoffs are likely to be evaluated differently in different subdisciplines.
To the extent that these tradeoffs are actively evaluated, we need to change reward structures (e.g., through grant or publication incentives or requirements) to encourage deposit of data. More broadly, we need to change disciplinary norms about what constitutes responsible professional behavior with respect to depositing different classes of data. Professional societies can play an active role in this regard. Another way of encouraging deposit is to require attribution of credit (or better, formal citation) for deposited data, and to value these citations professionally as we value ordinary publication citations.
The disincentives to deposit can be diminished by maximizing ease of use and minimizing cost. However, even with software tailored to streamline use, there will be a necessary tradeoff between the time investment required and the quality of the metadata and data obtained. Finally, prominent and compelling examples will be invaluable in demonstrating the scholarly value of deposit.
In this context, it is important to distinguish between “new” and “legacy” data. For projects that are just starting, digital archiving is a much simpler problem. The costs of archiving can be built into the project, along with the procedures, the metadata standards, and the identification of the ultimate repository. Projects that are complete or ongoing present a very different set of problems. The data were not collected with digital archiving in mind, and often the investigators are deceased or unable to place the data in acceptable formats or to create the metadata needed to make them usable. Even in cases in which the investigator is willing to invest the time and energy, there is great difficulty obtaining financial support. The two situations are qualitatively different and require very different solutions. Solving the archiving issues for new projects is simpler and should proceed first. Professional societies and funding agencies should set guidelines for new projects and begin to enforce them at the same time as they tackle the much more difficult issues involved with legacy data.
Repositories must have secure platforms with strong safeguards to prevent access to sensitive materials by individuals who are not authorized to see them. This demands not only a login but also ways of reliably authenticating user credentials. It was generally but not universally accepted in the full group that a login should be required even for access to material that is not in some way restricted. User agreements, informed by professional ethics, will need to be established by the repositories.
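The safeguards described above can be illustrated with a minimal sketch. It is not drawn from any particular repository platform: the names User, Item, and may_access are hypothetical, the "authenticated" flag stands in for a real credential check, and treating acceptance of the user agreement as a precondition for access is an assumption. The sketch follows the majority view that a login is required even for unrestricted material.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    # Hypothetical user record. "authenticated" stands in for the result of
    # a real credential check performed by the repository's login system.
    name: str
    authenticated: bool = False
    accepted_agreement: bool = False
    # Identifiers of restricted items this user has been cleared to see.
    authorized_for: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class Item:
    # Hypothetical deposited item. "restricted" marks sensitive material.
    item_id: str
    restricted: bool = False


def may_access(user: User, item: Item) -> bool:
    """Return True only if every safeguard in this sketch is satisfied."""
    # A login is required even for unrestricted material (majority view).
    if not user.authenticated:
        return False
    # Assumption: the user agreement must be accepted before any access.
    if not user.accepted_agreement:
        return False
    # Restricted material additionally requires explicit authorization.
    if item.restricted:
        return item.item_id in user.authorized_for
    return True
```

Under these assumptions, an anonymous visitor is refused everything, an ordinary logged-in user sees only unrestricted items, and sensitive items are released only to users explicitly cleared for them. A repository taking the minority view would relax the first check for unrestricted items.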
As noted in the OAIS standard (CCSDS 650.0-B-1), which provides a reference model for a digital repository and for a digital information object, storage is one of six interconnected components of the reference architecture; the others are Ingest, Administration, Data Management, Access, and Preservation Planning. No component stands alone, and it is important to approach this subject as an interconnected web of related issues.
There is a steep learning curve to understanding these technologies, and hiring developers is expensive. One way to overcome these challenges is to appeal to granting agencies for additional support to build specialized systems, based on open source technologies, that other anthropological research projects could then leverage. Although repositories offer mostly the same functionality, there are important differences in how the systems represent stored data, a representation technically referred to as a data model. Just as the ability to search and discover is tightly bound to the representation of data, the ability to preserve data is tightly coupled to a data model that facilitates preservation planning and preservation treatments.
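To make the idea of a data model more concrete, the following is a minimal sketch, not the model of any existing repository system. The class StoredObject, its fields, and the helper functions are all hypothetical. It shows descriptive fields supporting search alongside preservation fields (file format, checksum, event log) supporting preservation planning and treatment.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    # Descriptive fields: these support search and discovery.
    title: str
    keywords: list
    # Preservation fields: these support planning and treatment.
    file_format: str
    content: bytes
    sha256: str = ""
    events: list = field(default_factory=list)

    def __post_init__(self):
        # Record a checksum at ingest if none was supplied.
        if not self.sha256:
            self.sha256 = hashlib.sha256(self.content).hexdigest()
            self.events.append("ingest: checksum recorded")

    def fixity_ok(self) -> bool:
        # True if the content still matches the recorded checksum.
        return hashlib.sha256(self.content).hexdigest() == self.sha256


def search(objects, keyword):
    # Discovery depends on the descriptive part of the model.
    return [o for o in objects if keyword in o.keywords]


def needs_migration(objects, at_risk_formats):
    # Preservation planning depends on the preservation part of the model.
    return [o for o in objects if o.file_format in at_risk_formats]
```

A system whose model omitted the format or checksum fields could still answer the search query, but it could not identify files in at-risk formats or detect silent corruption, which is the coupling between data model and preservation that the paragraph above describes.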